The Disk - Covering Method for Tree Reconstruction 631

نویسندگان

  • Daniel Huson
  • Scott Nettles
  • Laxmi Parida
  • Tandy Warnow
  • Shibu Yooseph
چکیده

Evolutionary tree reconstruction is a very important step in many biological research problems , and yet is extremely diicult for a variety of computational, statistical, and scientiic reasons. In particular, the reconstruction of very large trees containing signiicant amounts of divergence is especially challenging. We present in this paper a new tree reconstruction method, which we call the Disk-Covering Method, which can be used to recover accurate estimations of the evolutionary tree for otherwise intractable datasets. DCM obtains a decomposition of the input dataset into small overlapping sets of closely related taxa, reconstructs trees on these subsets (using a \base" phylogenetic method of choice), and then combines the subtrees into one tree on the entire set of taxa. Because the subproblems analyzed by DCM are smaller, com-putationally expensive methods such as maximum likelihood estimation can be used without incurring too much cost. At the same time, because the taxa within each subset are closely related, even very simple methods (such as neighbor-joining) are much more likely to be highly accurate. The result is that DCM-boosted methods are typically faster and more accurate as compared to \naive" use of the same method. In this paper we describe the basic ideas and techniques in DCM, and demonstrate the advantages of DCM experimentally by simulating sequence evolution on a variety of trees.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Disk-Covering, a Fast-Converging Method for Phylogenetic Tree Reconstruction

The evolutionary history of a set of species is represented by a phylogenetic tree, which is a rooted, leaf-labeled tree, where internal nodes represent ancestral species and the leaves represent modern day species. Accurate (or even boundedly inaccurate) topology reconstructions of large and divergent trees from realistic length sequences have long been considered one of the major challenges i...

متن کامل

Solving Large Scale Phylogenetic Problems using DCM2

In an earlier paper, we described a new method for phylogenetic tree reconstruction called the Disk Covering Method, or DCM. This is a general method which can be used with any existing phylogenetic method in order to improve its performance. We showed analytically and experimentally that when DCM is used in conjunction with polynomial time distance-based methods, it improves the accuracy of th...

متن کامل

Scaling up accurate phylogenetic reconstruction from gene-order data

MOTIVATION Phylogenetic reconstruction from gene-order data has attracted increasing attention from both biologists and computer scientists over the last few years. Methods used in reconstruction include distance-based methods (such as neighbor-joining), parsimony methods using sequence-based encodings, Bayesian approaches, and direct optimization. The latter, pioneered by Sankoff and extended ...

متن کامل

Quartet-Based Phylogeny Reconstruction from Gene Orders

Phylogenetic reconstruction from gene-rearrangement data is attracting increasing attention from biologists and computer scientists. Methods used in reconstruction include distance-based methods, parsimony methods using sequence encodings, and direct optimization. The latter, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach; however, its exha...

متن کامل

Sequence-Length Requirements for Phylogenetic Methods

We study the sequence lengths required by neighbor-joining, greedy parsimony, and a phylogenetic reconstruction method (DCMNJ+MP) based on disk-covering and the maximum parsimony criterion. We use extensive simulations based on random birth-death trees, with controlled deviations from ultrametricity, to collect data on the scaling of sequence-length requirements for each of the three methods as...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998